Automatic Category Analysis
In [ ]:
import pandas as pd
In [2]:
df = pd.read_csv("googleplaystore.csv").dropna()
In [3]:
df.head(5)
Out[3]:
 | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Photo Editor & Candy Camera & Grid & ScrapBook | ART_AND_DESIGN | 4.1 | 159 | 19M | 10,000+ | Free | 0 | Everyone | Art & Design | January 7, 2018 | 1.0.0 | 4.0.3 and up |
1 | Coloring book moana | ART_AND_DESIGN | 3.9 | 967 | 14M | 500,000+ | Free | 0 | Everyone | Art & Design;Pretend Play | January 15, 2018 | 2.0.0 | 4.0.3 and up |
2 | U Launcher Lite – FREE Live Cool Themes, Hide ... | ART_AND_DESIGN | 4.7 | 87510 | 8.7M | 5,000,000+ | Free | 0 | Everyone | Art & Design | August 1, 2018 | 1.2.4 | 4.0.3 and up |
3 | Sketch - Draw & Paint | ART_AND_DESIGN | 4.5 | 215644 | 25M | 50,000,000+ | Free | 0 | Teen | Art & Design | June 8, 2018 | Varies with device | 4.2 and up |
4 | Pixel Draw - Number Art Coloring Book | ART_AND_DESIGN | 4.3 | 967 | 2.8M | 100,000+ | Free | 0 | Everyone | Art & Design;Creativity | June 20, 2018 | 1.1 | 4.4 and up |
Q1. Total number of apps in each category
In [9]:
# Count how many apps fall into each category.
categories = {}
for name in df['Category'].unique():
    ct = 0
    for i in df['Category']:
        if i == name:
            ct += 1
    categories[name] = ct
for i in categories:
    print(i, ":", categories[i])
ART_AND_DESIGN : 61
AUTO_AND_VEHICLES : 73
BEAUTY : 42
BOOKS_AND_REFERENCE : 178
BUSINESS : 303
COMICS : 58
COMMUNICATION : 328
DATING : 195
EDUCATION : 155
ENTERTAINMENT : 149
EVENTS : 45
FINANCE : 323
FOOD_AND_DRINK : 109
HEALTH_AND_FITNESS : 297
HOUSE_AND_HOME : 76
LIBRARIES_AND_DEMO : 64
LIFESTYLE : 314
GAME : 1097
FAMILY : 1746
MEDICAL : 350
SOCIAL : 259
SHOPPING : 238
PHOTOGRAPHY : 317
SPORTS : 319
TRAVEL_AND_LOCAL : 226
TOOLS : 733
PERSONALIZATION : 312
PRODUCTIVITY : 351
PARENTING : 50
WEATHER : 75
VIDEO_PLAYERS : 160
NEWS_AND_MAGAZINES : 233
MAPS_AND_NAVIGATION : 124
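The nested loops above rescan the whole column once per category. As a minimal sketch of a more idiomatic alternative, pandas' `Series.value_counts()` produces the same kind of tally in a single call. The small `sample` DataFrame below is made-up illustration data, not the Play Store dataset, and by default the result is sorted by descending count rather than by order of first appearance.

```python
import pandas as pd

# Hypothetical sample data standing in for df['Category'].
sample = pd.DataFrame(
    {"Category": ["GAME", "TOOLS", "GAME", "FAMILY", "GAME", "TOOLS"]}
)

# value_counts() tallies each distinct value in the column.
counts = sample["Category"].value_counts()

# Print in the same "NAME : count" layout used by the loop version.
for name, ct in counts.items():
    print(name, ":", ct)

# Convert to a plain dict if a dictionary is needed downstream.
categories = counts.to_dict()
```

The same call applies unchanged to the `Type` and `Content Rating` columns.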
Q2. Total number of apps in each Type
In [5]:
# Count how many apps fall into each Type (Free / Paid).
types = {}
for name in df['Type'].unique():
    ct = 0
    for i in df['Type']:
        if i == name:
            ct += 1
    types[name] = ct
print(types)
{'Free': 8715, 'Paid': 645}
Q3. Total number of apps in each Content Rating
In [6]:
# Count how many apps fall into each Content Rating.
content_rating = {}
for name in df['Content Rating'].unique():
    ct = 0
    for i in df['Content Rating']:
        if i == name:
            ct += 1
    content_rating[name] = ct
print(content_rating)
{'Everyone': 7414, 'Teen': 1084, 'Everyone 10+': 397, 'Mature 17+': 461, 'Adults only 18+': 3, 'Unrated': 1}
In [13]:
df['Rating'].describe()
Out[13]:
count    9360.000000
mean        4.191838
std         0.515263
min         1.000000
25%         4.000000
50%         4.300000
75%         4.500000
max         5.000000
Name: Rating, dtype: float64
In [12]:
df['Reviews'].describe()
Out[12]:
count     9360
unique    5990
top          2
freq        83
Name: Reviews, dtype: object
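The `Reviews` summary reports `unique`, `top`, and `freq` instead of a mean and quartiles because the column has dtype `object`, which most likely means it was read as strings. A minimal sketch of converting such a column with `pd.to_numeric` before summarizing it follows. The `"3.0M"` entry is a hypothetical malformed value added for illustration, and using `errors="coerce"` (which turns unparseable entries into NaN) is an assumption about how such entries should be handled.

```python
import pandas as pd

# Illustrative string-typed column; "3.0M" is a hypothetical bad entry.
reviews = pd.Series(["159", "967", "87510", "3.0M"], name="Reviews", dtype="object")

# Parse to numbers; values that cannot be parsed become NaN.
numeric_reviews = pd.to_numeric(reviews, errors="coerce")

# describe() now returns numeric statistics (count, mean, std, quartiles).
print(numeric_reviews.describe())
```

On the real DataFrame, the equivalent step would be assigning the converted Series back to `df['Reviews']` before calling `describe()`.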